In [202]:
import pandas as pd
import numpy as np 
import gurobipy as gp
from gurobipy import GRB
from shapely import wkt
import os
import geopandas as gpd
import folium

import contextily as cx
from IPython.core.interactiveshell import InteractiveShell

from shapely.geometry import Point
In [163]:
print(os.getenv('GRB_LICENSE_FILE'))
None
In [164]:
params = {
"WLSACCESSID": 'c819885e-2631-47d7-8108-8392e184ea0c',
"WLSSECRET": '534f1fd3-e9f5-4cd1-ac31-080d6502abf5',
"LICENSEID": 2398019,
}
env = gp.Env(params=params)

# Create the model within the Gurobi environment
model = gp.Model(env=env)
Set parameter WLSAccessID
Set parameter WLSSecret
Set parameter LicenseID to value 2398019
Academic license - for non-commercial use only - registered to hk939@georgetown.edu

Optimization of building energy efficiency interventions using BlocPower data

Introduction

Building decarbonization is a critical component of the multiple transitions needed to combat climate change. The Inflation Reduction Act of 2022 provides unprecedented incentives for households to install home improvements for electrification and energy efficiency, like installing heat pumps instead of fossil-fuel powered heaters and weatherization/insulation (https://www.nytimes.com/interactive/2023/climate/tax-breaks-inflation-reduction-act.html)

These efforts have been aided by technological advances in the form of fine-grained building-level data, recently available at nationwide scale. One such dataset - developed by BlocPower - uses data on building type and equipment (single -family/multi-family, heating and cooling systems) sourced from tax assessment records. It then uses them as inputs for energy modeling algorithms/software developed by the Department of Energy. Crucially, this allows estimation of counterfactuals, i.e. the reduction in energy usage that would result from targeted retrofits. BlocPower’s own core operation takes building data for a city, uses their proprietary algorithm to estimate the ideal intervention, potential cost of a retrofit, and projected reduction in energy use (it also provides financing options).

Finance is the key variable in these ambitious transition plans. Many of the new policies provide incentives like tax credits and rebates, instead of direct transfers to finance retrofits. However, it is useful to get an estimate of the total cost required to achieve a specified reduction in energy use and emissions from buildings. (This is also useful where governments are spending money directly to decarbonize their own building portfolios). https://www.whitehouse.gov/briefing-room/statements-releases/2022/12/07/fact-sheet-biden-harris-administration-announces-first-ever-federal-building-performance-standard-catalyzes-american-innovation-to-lower-energy-costs-save-taxpayer-dollars-and-cut-emissions/

The set of possible interventions for each building can consist of:

  1. Weatherization
  2. Energy efficiency retrofits/electrification
  3. Rooftop solar

Linear programming formulation

Despite some differences in these particular interventions, the basic structure requires an upfront cost to be spent on a building, after which its energy usage will go down by some factor (i.e. there will be energy savings). For our current purpose, let weatherization be the only building intervention we consider. We abstract away the exact financing mechanisms included in the IRA and BIL (rebates and incentives), and assume that the local government has a fixed budget to allocate for weatherizing buildings. The objective is to maximize energy savings - or equivalently, reduce emissions - subject to the constraint that the total cost cannot be larger than the budget. For this, we need recommendations of which buildings to target.

Linear programming is a well-established mathematical tool used for policy purposes, starting from optimizing military logistics during WW2 to optimizing the energy source mixture (see appendix). This problem formulation fits well with a linear programming framework. Specifically, this is an example of a Mixed-Integer Programming (MIP) problem, as the solution must be an integer (we can't weatherize 3.6 buildings).

Read in geocoded building data (Madisonville, KY)

In [249]:
df = pd.read_csv(r"D:\Work\Georgetown\acad\mdi\summer_research\bp\geocoded_kentucky_zip_buildings.csv")  # Reference any table in this project

test = df[['geocoded', 'building_id' , 'building_type' , 'heating_fuel_type', 'total_site_energy_GJ',
            "address", "area_sq_ft", "energy_use_intensity"]]

test = test[test['heating_fuel_type'] != 'Unknown']
test = test.dropna(subset=['heating_fuel_type'])
#test = test.sample(frac=0.8).reset_index()
test['geocoded'] = test['geocoded'].str.replace('(','').str.replace(')','') # Remove parentheses
test[['lat', 'lon']] = test['geocoded'].str.split(',', expand=True)
test['lat'] = test['lat'].astype(float)
test['lon'] = test['lon'].astype(float)

# Convert latitude and longitude to a point
test['geometry'] = test.apply(lambda row: Point(row.lon, row.lat), axis=1)
C:\Users\HK\AppData\Local\Temp\ipykernel_1480\2571606248.py:9: FutureWarning: The default value of regex will change from True to False in a future version. In addition, single character regular expressions will *not* be treated as literal strings when regex=True.
  test['geocoded'] = test['geocoded'].str.replace('(','').str.replace(')','') # Remove parentheses

Model 1: Basic

Model Formulation

Google Doc: https://docs.google.com/document/d/1JikyyMgS6zkjCQZMJQrtv8sU_ut1Uf0n8xkclPbHVg8/edit

Paper: https://www.sciencedirect.com/science/article/pii/S0306261922010510

Sets and Indices

$i \in T$: Index and set of potential buildings to weatherize.

Parameters

$c_{i} \in \mathbb{R}^+$: The cost of weatherizing building $i$.

$e_{i} \in \mathbb{R}^+$: The energy use of building $i$.

Decision Variables

$w_{i} \in [0, 1]$: This variable is equal to 1 if building $i$ is weatherized; and 0 otherwise.

Objective Function

  • Energy savings. We seek to maximize the total energy savings from all buildings.
\begin{equation} \text{Max} \quad Z = \sum_{i \in T} e_{i} \cdot w_{i} \tag{1} \end{equation}

Constraints

  • Budget. We need to ensure that the total cost of interventions do not exceed the allocated budget.
\begin{equation} \sum_{i \in T} c_{i} \cdot w{i} \leq \text{budget} \tag{2} \end{equation}

Budget = USD 3,000,000

Parameters

Costs:

The complication is that the cost $c_{i}$ and energy savings $e_{i}$ is different for each building, based on the chosen intervention and the building's own characteristics. Installing a heat pump or weatherizing in a gas-heating building of 6,000 sq. ft. would have a different cost from installing the same equipment in an oil-heating building of 3,000 sq. ft. Depending on local labor and material costs, even the exact same project on comparable buildings would have different costs in Wichita and Ithaca.

Industry experts have the following input:

  • Engie (Andrew Ludwig): 'Because of the wide variations, a heuristic approach is probably the best you can do.

  • BlocPower (Ankur Garg): 'The cost estimation is a process which requires local data, personnel hours and itself has an expense associated with it'

Energy savings:

The original paper by Heleno et al (2022) seems to use a heuristic approach, by calculating cost and savings factors from the Weatherization Assistance Program (WAP) for different types of building archetypes. https://www.sciencedirect.com/science/article/pii/S0306261922010510

For this current example, we assume the following factors. So, weatherizing a gas-heated building reduces energy use by 4% and costs 2000 USD.

  • Energy savings factors:
    • Gas Buildings = 0.96
    • Oil Buildings = 0.98
  • Cost:
    • Gas buildings = 2000
    • Oil Buildings = 3000

Multiply the above factors with the cost and energy columns to obtain 2 new columns per building, ei (energy savings after weatherization) and ci (cost of weatherization).

In [250]:
test = test[test['heating_fuel_type'].isin(['Oil','Gas'])]

test['energy_savings'] = test.apply(lambda row: row['total_site_energy_GJ'] - (0.96 * row['total_site_energy_GJ'])
                                    if row['heating_fuel_type'] == 'Oil' 
                                    else (row['total_site_energy_GJ'] - 
                                    (0.98 * row['total_site_energy_GJ']) if row['heating_fuel_type'] == 'Gas' 
                                    else row['total_site_energy_GJ']), axis=1)

test['cost'] = test.apply(lambda row: 3000
                                    if row['heating_fuel_type'] == 'Oil' 
                                    else (2000), axis=1)
test = test.reset_index()

Optimize in Gurobi

In [272]:
# Create a new model
m = gp.Model("mip1", env=env)

# Add variables
w = m.addVars(test.index, vtype=GRB.BINARY, name="w")

# Set objective
m.setObjective(gp.quicksum(w[i]*test['energy_savings'].iloc[i] for i in test.index), GRB.MAXIMIZE)

# Add constraint: sum of w[i]*c_i <= budget
m.addConstr(gp.quicksum(w[i]*test['cost'].iloc[i] for i in test.index) <= 3000000)

# Optimize model
m.optimize()

# Check optimization status
selected_indices = [i for i in test.index if w[i].x > 0]  # Store selected indices

# Add 'opt' column to the dataframe
test['opt'] = np.where(test.index.isin(selected_indices), 1, 0)

# Calculate and print final energy savings and cost of intervention
energy_savings = sum(w[i].x * test['energy_savings'].iloc[i] for i in selected_indices)
cost = sum(w[i].x * test['cost'].iloc[i] for i in selected_indices)

print("Energy savings achieved:", energy_savings)
print("Cost of intervention:", cost)
print('Buildings weatherized:', test[(test.opt==1)].shape[0] )
Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (win64)

CPU model: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz, instruction set [SSE2|AVX|AVX2]
Thread count: 6 physical cores, 12 logical processors, using up to 12 threads

Academic license - for non-commercial use only - registered to hk939@georgetown.edu
Optimize a model with 1 rows, 12625 columns and 12625 nonzeros
Model fingerprint: 0x9b33a4e9
Variable types: 0 continuous, 12625 integer (12625 binary)
Coefficient statistics:
  Matrix range     [2e+03, 3e+03]
  Objective range  [4e-01, 1e+03]
  Bounds range     [1e+00, 1e+00]
  RHS range        [3e+06, 3e+06]
Found heuristic solution: objective 17384.615178
Presolve removed 0 rows and 682 columns
Presolve time: 0.08s
Presolved: 1 rows, 11943 columns, 11943 nonzeros
Variable types: 0 continuous, 11943 integer (11285 binary)
Found heuristic solution: objective 21321.414578

Root relaxation: objective 6.184998e+04, 1 iterations, 0.00 seconds (0.00 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

     0     0 61849.9811    0    1 21321.4146 61849.9811   190%     -    0s
H    0     0                    61839.613522 61849.9811  0.02%     -    0s
H    0     0                    61846.451522 61849.9811  0.01%     -    0s

Explored 1 nodes (1 simplex iterations) in 1.53 seconds (0.08 work units)
Thread count was 12 (of 12 available processors)

Solution count 4: 61846.5 61839.6 21321.4 17384.6 

Optimal solution found (tolerance 1.00e-04)
Best objective 6.184645152222e+04, best bound 6.184998112183e+04, gap 0.0057%
Energy savings achieved: 61846.45152221681
Cost of intervention: 3000000.0
Buildings weatherized: 1447

Energy savings achieved: 61656.41 GJ

Cost of intervention: USD 3000000

Buildings weatherized: 1442

Visualize recommendations on map

In [273]:
# gdf = df[['building_id', 'geocoded', 'area_sq_ft', 'address', 'energy_use_intensity']].copy()

# gdf = gdf.round(2)
# # merge to optimized dataframe
# gdf = gdf.merge(test, left_on='building_id', right_on='building_id', how='inner')

# Convert the DataFrame to a GeoDataFrame
gdf = gpd.GeoDataFrame(test, geometry='geometry', crs='epsg:4326')

# create retrofit column
gdf['Retrofit?'] = gdf['opt'].apply(lambda x: 'YES' if x == 1 else 'NO')

#round energy savings column
gdf['energy_savings'] = gdf['energy_savings'].round(3)

m1 = gdf.explore("opt", #tooltip=False, #categorical=True, 
                 tooltip=["address", "area_sq_ft", "energy_use_intensity",
                        "building_type", "heating_fuel_type", "energy_savings", "cost", "Retrofit?"])

 # Set `preferCanvas` to optimize performance of map
m1.options["preferCanvas"] = True

#set a tile layer - OSM
folium.TileLayer("OpenStreetMap").add_to(m1)
m1
Out[273]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Model 2: Adding disadvantage and J40 criteria

This also offers an interesting way to analyse the implications of recent federal policies that seek to prioritise investments in census tracts identified as 'disadvantaged'.

Sets and Indices

$i \in T$: Index and set of potential buildings to weatherize.

$j \in Z$: Index and set of tracts

Parameters

$c_{ij} \in \mathbb{R}^+$: The cost of weatherizing building $i$ in zipcode $j$

$e_{ij} \in \mathbb{R}^+$: The energy savings of building $i$ in zipcode $j$

$dis_{j} \in {0,1}$: The disadvantage rating of a tract $j$

Decision Variables

$w_{ij} \in {0, 1 }$: This variable is equal to 1 if building $i$ in tract $j$ is weatherized; and 0 otherwise.

Objective Function(s)

  • Energy savings. We seek to maximize the total energy savings from all buildings.
\begin{equation} \text{Max} \quad E = \sum_{{i \in T}{j \in Z}} e_{ij} \cdot w_{ij} \tag{3} \end{equation}

Constraints

    1. Budget. Total cost of interventions should not exceed the allocated budget.
\begin{equation} \sum_{i \in T} c_{ij} \cdot w_{ij} \leq \text{budget} \tag{4} \end{equation}
    1. Justice40. 40% of total funding for building interventions should go to buildings in disadvantaged tracts.
\begin{equation} \sum_{i \in T} \sum_{j \in Z | dis_j = 1} c_{ij} \cdot w_{ij} \geq 0.4 \times \text{budget} \tag{5} \end{equation}
In [267]:
#read in shapefile of Kentucky CEJST
kycj = pd.read_csv(r"D:\Work\Georgetown\acad\mdi\summer_research\bp\ky_cejst_shp.csv") 

kycj['geom'] = kycj['geometry'].apply(wkt.loads)

#make geodataframe
kycj = gpd.GeoDataFrame(kycj, geometry='geom', crs='epsg:4326')

#spatial join with CEJST with building coordinates
gdf_cj = gpd.sjoin(left_df=gdf, right_df=kycj, how='inner', predicate = 'intersects')

gdf_cj['geometry_right'] = gdf_cj['geometry_right'].apply(wkt.loads)


gdf_cj = gpd.GeoDataFrame(gdf_cj, geometry='geometry_right', crs='epsg:4326')

gdf_cj['disadv'] = gdf_cj['CC'].apply(lambda x: 1 if x >0 else 0)
In [274]:
# plot

fig, ax = plt.subplots(figsize=(12,10))

gdf_cj.plot('CC', ax=ax, cmap='Blues', categorical=True, legend=True,
            edgecolor='k', alpha=0.5, linewidth=0.2)

# legend_kwds={"label": "kBTU/sq. ft", 
#                          "orientation": "horizontal", 
#                          "shrink": 0.6, 
#                          "aspect": 30,  # Adjust the length of legend
#                          "pad": 0.02,   # Adjust the space between the plot and the legend
#                          "fraction": 0.046,  # Adjust the height of the legend
#                        })  # Increase the fontsize for the colorbar ticks here)


gdf.plot(column='energy_use_intensity', ax=ax, alpha=0.7, cmap='autumn')
          
plt.axis('off')
ax.set_xticks([])
ax.set_yticks([])
ax.set_title('Census tracts by disadvantage thresholds exceeded (blue), overlayed with buildings (red)', fontsize=18)
Out[274]:
Text(0.5, 1.0, 'Census tracts by disadvantage thresholds exceeded (blue), overlayed with buildings (red)')

Optimize in Gurobi

In [255]:
# Create a new model
m = gp.Model("mip2", env=env)

# Add binary variables for each building
w = m.addVars(gdf_cj.index, vtype=GRB.BINARY, name="w")

# Set objective to maximize energy savings
m.setObjective(gp.quicksum(w[i] * gdf_cj.loc[i, 'energy_savings'] for i in gdf_cj.index), GRB.MAXIMIZE)

# Add budget constraint: total cost should not exceed the budget
m.addConstr(gp.quicksum(w[i] * gdf_cj.loc[i, 'cost'] for i in gdf_cj.index) <= 3000000)

# Add J40 constraint: at least 40% of the budget should be used in disadvantaged buildings
disadvantaged_buildings = gdf_cj[gdf_cj['disadv'] == 1].index
m.addConstr(gp.quicksum(w[i] * gdf_cj.loc[i, 'cost'] for i in disadvantaged_buildings) >= 0.4 * 3000000)

# Optimize model
m.optimize()

# Get the selected indices
selected_indices = [i for i in gdf_cj.index if w[i].x > 0.5]


# Add 'opt' column to the dataframe
gdf_cj['opt'] = np.where(gdf_cj.index.isin(selected_indices), 1, 0)

# Calculate and print final energy savings and cost of intervention
energy_savings = sum(w[i].x * gdf_cj.loc[i, 'energy_savings'] for i in selected_indices)
cost = sum(w[i].x * gdf_cj.loc[i, 'cost'] for i in selected_indices)

print("Energy savings achieved:", energy_savings)
print("Cost of intervention:", cost)
print('Buildings weatherized:', gdf_cj[(gdf_cj.opt==1)].shape[0] )
Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (win64)

CPU model: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz, instruction set [SSE2|AVX|AVX2]
Thread count: 6 physical cores, 12 logical processors, using up to 12 threads

Academic license - for non-commercial use only - registered to hk939@georgetown.edu
Optimize a model with 2 rows, 12625 columns and 16370 nonzeros
Model fingerprint: 0xf4724688
Variable types: 0 continuous, 12625 integer (12625 binary)
Coefficient statistics:
  Matrix range     [2e+03, 3e+03]
  Objective range  [4e-01, 1e+03]
  Bounds range     [1e+00, 1e+00]
  RHS range        [1e+06, 3e+06]
Found heuristic solution: objective 18814.898000
Presolve removed 0 rows and 1868 columns
Presolve time: 0.10s
Presolved: 2 rows, 10757 columns, 14263 nonzeros
Variable types: 0 continuous, 10757 integer (9167 binary)
Found heuristic solution: objective 20780.183000

Root relaxation: objective 6.165825e+04, 3 iterations, 0.01 seconds (0.00 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

     0     0 61658.2490    0    1 20780.1830 61658.2490   197%     -    0s
H    0     0                    61656.413000 61658.2490  0.00%     -    0s

Explored 1 nodes (3 simplex iterations) in 0.32 seconds (0.10 work units)
Thread count was 12 (of 12 available processors)

Solution count 3: 61656.4 20780.2 18814.9 

Optimal solution found (tolerance 1.00e-04)
Best objective 6.165641300000e+04, best bound 6.165824900000e+04, gap 0.0030%
Energy savings achieved: 61656.41299999999
Cost of intervention: 3000000.0
Buildings weatherized: 1442

Energy savings achieved: 61656.41 GJ

Cost of intervention: USD 3000000

Buildings weatherized: 1442

In [256]:
# sum of total costs in disadvantaged tracts vs others
gdf_cj[(gdf_cj.opt==1)].groupby('disadv')['cost'].sum()
Out[256]:
disadv
0    1800000
1    1200000
Name: cost, dtype: int64

END HERE

Widget/tool

In [257]:
# interactive

import ipywidgets as widgets
from IPython.display import display

budget_input = widgets.BoundedFloatText(
    value=3000000,
    min=0,
    max=1e10,
    step=1000,
    description='Budget:',
    disabled=False
)

display(budget_input)

button = widgets.Button(description="Optimize")
output = widgets.Output()
display(button, output)


def on_button_clicked(b):
    with output:
        # Clear the output
        output.clear_output()

        # Get the budget value
        budget = budget_input.value

        # Place your optimization code here. For example:

        ###########################################################
        # Create a new model
        m = gp.Model("mip1", env=env)

        # Add variables
        w = m.addVars(test.index, vtype=GRB.BINARY, name="w")

        # Set objective
        m.setObjective(gp.quicksum(w[i]*test['energy_savings'].iloc[i] for i in test.index), GRB.MAXIMIZE)

        # Add constraint: sum of w[i]*c_i <= budget
        m.addConstr(gp.quicksum(w[i]*test['cost'].iloc[i] for i in test.index) <= budget)

        # Optimize model
        m.optimize()

        # Check optimization status
        selected_indices = [i for i in test.index if w[i].x > 0]  # Store selected indices

        # Add 'opt' column to the dataframe
        test['opt'] = np.where(test.index.isin(selected_indices), 1, 0)

        # Calculate and print final energy savings and cost of intervention
        energy_savings = sum(w[i].x * test['energy_savings'].iloc[i] for i in selected_indices)
        cost = sum(w[i].x * test['cost'].iloc[i] for i in selected_indices)

        print("Energy savings achieved:", energy_savings)
        print("Cost of intervention:", cost)        # Place your visualization code here
        print('Buildings weatherized:', test[(test.opt==1)].shape[0] )
        #####################################################################################################
        return test

button.on_click(on_button_clicked)
In [ ]: